fix: prevent vllm_omni from breaking Qwen3 vLLM inference #79

Merged
timzsu merged 3 commits into main from zsu/fix-inference-bug on Jun 21, 2026

@timzsu timzsu commented Jun 19, 2026

Collaborator

Purpose

When vLLM Omni 0.22.0 is installed, vLLM 0.22.0 fails to initialize an engine for models that use Model Runner V2 (e.g., Qwen3-8B). This PR implements a workaround. I have also fixed the root cause upstream (vllm-project/vllm-omni#4568), so we can revert this workaround the next time we bump the versions.

Changes

  • src/worker/executors/__init__.py: Import vLLM Omni lazily so that the plugin is not registered unless the Omni executor is in use.
  • src/worker/executors/vllm_executor.py: Remove the Omni plugin from vLLM when VLLMExecutor is in use.
  • examples/templates/inference_vllm_qwen3.yaml: Add a sample workflow that surfaces the problem.
  • The other three files contain lint fixes and new unit tests.

Test Plan

End-to-end execution of examples/templates/inference_vllm_qwen3.yaml.

Test Result

The workflow completes successfully.


Pre-submission Checklist
  • I have read the contribution guidelines.
  • I have run pre-commit run --all-files and fixed any issues.
  • I have added or updated tests covering my changes (if applicable).
  • I have verified that uv run pytest tests/ passes locally.
  • If I changed shared schemas or proto definitions, I have checked downstream compatibility across Server and Worker.
  • If I changed the SDK or CLI, I have verified the affected packages work (uv sync --all-packages --group ci --frozen).
  • If this is a breaking change, I have prefixed the PR title with [BREAKING] and described migration steps above.
  • I have updated documentation or config examples if user-facing behavior changed.

@timzsu timzsu marked this pull request as ready for review June 19, 2026 11:31
@timzsu timzsu requested a review from kaiitunnz as a code owner June 19, 2026 11:31

@kaiitunnz kaiitunnz left a comment

Collaborator

Left a few comments.

Comment thread src/worker/executors/vllm_executor.py Outdated
Comment thread src/worker/executors/__init__.py Outdated
Comment thread tests/worker/test_vllm_scoped_plugins.py Outdated
Comment thread tests/worker/test_vllm_scoped_plugins.py Outdated
Comment thread tests/worker/test_vllm_scoped_plugins.py Outdated
Comment thread examples/templates/inference_vllm_qwen3.yaml Outdated
Comment thread examples/templates/inference_vllm_qwen3.yaml Outdated
Comment thread examples/templates/inference_vllm_qwen3.yaml Outdated
@timzsu timzsu requested a review from kaiitunnz June 20, 2026 10:06
timzsu added 3 commits June 20, 2026 19:36
Signed-off-by: Zhengyuan Su <su.zhengyuan@u.nus.edu>
Signed-off-by: Zhengyuan Su <su.zhengyuan@u.nus.edu>
Signed-off-by: Zhengyuan Su <su.zhengyuan@u.nus.edu>
@timzsu timzsu force-pushed the zsu/fix-inference-bug branch from f7e9694 to c321535 Compare June 20, 2026 11:36

@kaiitunnz kaiitunnz left a comment

Collaborator

LGTM.

@timzsu timzsu merged commit 302f675 into main Jun 21, 2026
11 of 13 checks passed
@timzsu timzsu deleted the zsu/fix-inference-bug branch June 21, 2026 01:51
@kaiitunnz kaiitunnz mentioned this pull request Jun 21, 2026
8 tasks
2 participants